Abstract

Amorphous computing is the study of programming ultra-scale computing environments 
of smart sensors and actuators [1]. The individual elements are identical, asynchronous, 
randomly placed, unreliable, embedded and communicate with a small local neighborhood 
via wireless broadcast. In such environments, where individual processors have limited 
resources, aggregating the processors into groups is useful for specialization, increased ro- 
bustness, and efficient resource allocation. 

This paper presents a new algorithm, called the clubs algorithm, for efficiently aggre- 
gating processors into groups in an amorphous computer, in time proportional to the local 
density of processors. The clubs algorithm takes advantage of the local broadcast commu- 
nication model of the amorphous computer and is efficient in an asynchronous setting. In 
addition, the algorithm derives two properties from the physical embedding of the amor- 
phous computer: an upper bound on the number of groups formed and a constant upper 
bound on the density of groups. The clubs algorithm forms a general mechanism for
symmetry breaking and can be extended to find the maximal independent set (MIS) and Δ + 1
vertex coloring in O(log N) rounds, where N is the total number of elements and Δ is the
maximum degree. Simulation results and example applications of clubs are also presented.
1 Introduction

Recent developments in micro-fabrication and 
nanotechnology will enable the inexpensive man- 
ufacturing of massive numbers of tiny comput- 
ing elements with integrated sensors and actua- 
tors. These computing and sensing agents can 
be applied to surfaces or embedded in structures 
to create active surfaces, improved materials and 
responsive environments [4, 8]. Amorphous com- 
puting [1] is the study of such ultra-scale com- 
puting environments, where the individual el- 
ements are bulk manufactured, randomly and 
densely distributed in the material, and commu- 
nicate wirelessly within a small local neighbor- 
hood. The objective of this research is to de- 
termine paradigms for coordinating the behavior 
and local interactions of millions of processing el- 
ements to achieve global goals. In such environ- 
ments, where individual processors have limited 
resources, aggregating processors into groups is 
a useful paradigm for programming. Groups can 
be used for increased robustness, task specializa- 
tion and efficient resource allocation [2, 3, 5]. 

This paper presents a new algorithm, called 
clubs, for efficiently organizing an amorphous 
computer into groups. The local wireless broad- 
cast and the asynchronicity of the amorphous 
computing elements make it difficult and inef- 
ficient to implement group forming algorithms 
that are designed for synchronous point-to-point 
networks, such as [3, 9]. The clubs algorithm 
forms groups in time proportional to the local 
density of processors by taking advantage of the 
local broadcast mechanism. The algorithm can 
be extended to the asynchronous environment 
without the use of complex synchronization and 
without sacrificing efficiency. 

The algorithm also satisfies many constraints 
that arise in other distributed environments such 
as not having access to global IDs or knowledge 
of the topology. We show how the algorithm 
can be extended to deal with processor failure. 
In addition, two interesting bounds on the clubs 
algorithm can be derived from the physical em- 
bedding of the amorphous computer - an upper 
bound on the number of groups formed by the 
clubs algorithm and a maximum density with 
which these groups can be packed, irrespective 
of the total number of processors. We present 
results from running the clubs algorithm on an 
amorphous computer simulation to support the 
analysis. 

The clubs algorithm also provides a general
mechanism for symmetry breaking in an amorphous
computer. We show how the clubs algorithm
can be extended to solve traditional distributed
computing problems like MIS and Δ + 1
vertex coloring in O(log N) rounds, where N is
the total number of elements and Δ is the maximum
degree. Lastly we present some example
applications of using the clubs algorithm to
self-organize non-local and point-to-point
communication infrastructure on top of the local
broadcast mechanism.

Section 2 presents the model for an amorphous 
computer. Section 3 presents the clubs algo- 
rithm. Section 4 presents the analysis of the al- 
gorithm with synchronous, asynchronous and un- 
reliable processors. Section 5 presents the prop- 
erties derived from the physical embedding. Sec- 
tion 6 presents simulation results. In Section 7 
we show how the clubs algorithm can be extended 
to solve for MIS and Δ + 1 coloring. Section 8
presents example applications of the clubs algo- 
rithm. 



2 Computational Model 

In this section we describe the model for an amor- 
phous computer. This model, and the intuition 
behind it, are presented in more detail in [1]. An 
amorphous computer consists of myriad process- 
ing elements. The processing elements: 

• are bulk manufactured and identical. They 
run the same program and do not have 
knowledge about the global topology. 

• have limited computing resources. 

• do not have globally unique identifiers but 
instead have random number generators for 
breaking symmetry. 

• are asynchronous. They have similar clock
speeds but do not operate in lockstep.

• are unreliable. A processor may stop execut- 
ing at any time - this failure model is known 
as stopping failures [10]. 

• have no precise interconnect. The processors 
are randomly and densely distributed. We 
assume that the density is sufficiently high 
so that all processors are connected and the 
variance in density is low. 

• are distributed on a surface or in a volume.
Processors occupy physical space and cannot be
packed arbitrarily close. In this paper we assume
that the processors are placed on a two dimensional
plane.

These features make it possible to cheaply 
manufacture and program large quantities of 
smart elements, and embed them in materials. 
What makes an amorphous computer different 
from traditional distributed and parallel comput- 
ers is the physical embedding of the amorphous 
computer and the communications model. 

• Processors communicate only with physi- 
cally nearby processors. Each processor 
communicates locally with all processors 
within a circular region of radius r. The lo- 
cal neighborhood size is much smaller than 
the total number of processors and r is much 
smaller than the dimensions of the surface. 

• Processors communicate with their local 
neighborhood by wireless broadcast. All 
processors share the same channel. There- 
fore collisions occur when two processors 
with overlapping broadcast regions send 
messages simultaneously. Collisions result in 
both messages being lost. A processor listening
to the channel can detect a collision because
it receives a garbled message. However,
the sender cannot detect a collision because
it cannot listen and transmit at the same
time. This model is similar to that of multi-hop
broadcast networks such as packet radio
networks [11].

This communication model allows for the sim- 
ple assembly of large numbers of nodes. How- 
ever, the interference due to overlapping broad- 
cast regions, lack of collision detection and asyn- 
chronicity of the processors make it difficult and 
inefficient to emulate point to point networks. 
In environments where processors have limited 
individual resources and are unreliable, assem- 
bling them into groups is advantageous. However 
group forming algorithms such as [3, 9], that are 
designed for point-to-point networks, are difficult 
to implement without a huge loss in efficiency. In 
addition such algorithms often require synchro- 
nizers to function correctly in asynchronous envi- 
ronments. Typical synchronizing techniques gen- 
erate large numbers of messages [10], further ex- 
acerbating the problem of message loss. Hence al- 
gorithms designed for synchronous point-to-point 
distributed computers do not easily extend to the 
amorphous computing environment. 



In the next sections we present the clubs al- 
gorithm, that takes advantage of the broadcast 
nature of the communications model and asyn- 
chronicity of the processors. The remainder of 
this section provides the notation that will be 
used in analyzing the performance of amorphous 
computing algorithms. 

The amorphous computer can be represented
as a graph where the nodes of the graph, V, are
the processors and N = |V|. From the communication
model, the set of edges is
E = {(i, j) | i, j ∈ V ∧ distance(i, j) < r}.
Hence an edge has a maximum physical distance
of r associated with it.
Each processor has a local neighborhood defined
by its communication region. For a processor i,
its local neighborhood is n(i) = {j | (i, j) ∈ E}
and its degree in the graph is the size of its
neighborhood, d(i) = |n(i)|. The average neighborhood
size is d_avg, where d_avg ≪ N. The number
of edges is |E| = (d_avg N)/2.

Processors occupy physical space and cannot
be placed on top of each other in a two dimensional
plane. Therefore, there is a limit, p_max, on
the number of processors that can fit in a
communication region. p_max is a physical constant
and is equal to (πr²/processor size). Hence p_max
provides a physical upper bound on the degree of
any processor, i.e. d(i) < p_max, for all processors
i. In addition, the number of neighbors within h
hops of a processor is physically upper bounded
by h² p_max in a two dimensional plane.
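The notation above can be made concrete with a small sketch. It is illustrative only: the function names (`neighborhoods`, `graph_stats`, `p_max`) and the representation of processors as 2D coordinate tuples are assumptions, not part of the paper.

```python
import math

def neighborhoods(points, r):
    """Return n(i) for every processor: all j with distance(i, j) < r."""
    n = len(points)
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                nbrs[i].add(j)
                nbrs[j].add(i)
    return nbrs

def graph_stats(points, r):
    """Return (degrees d(i), average degree d_avg, edge count |E|)."""
    nbrs = neighborhoods(points, r)
    degrees = [len(s) for s in nbrs]
    d_avg = sum(degrees) / len(points)
    edges = sum(degrees) // 2  # each edge is counted once at each endpoint
    return degrees, d_avg, edges

def p_max(r, processor_area):
    """Physical bound on neighborhood size: pi * r^2 / processor size."""
    return math.pi * r * r / processor_area
```

Because every edge contributes to two degrees, the edge count returned here always equals (d_avg N)/2, which is the identity stated in the text.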



3 Clubs Algorithm 

The objective is to aggregate processors into 
groups. The groups formed should have three 
main properties. 

1. All processors should belong to some group. 

2. All groups should have the same maximum 
diameter. 

3. A group should have local routing [2], which 
means that all processors within the group 
should be able to talk to each other using 
only processors within that same group. 

These features make groups useful for re- 
source allocation and self-organizing communi- 
cation networks [5]. The clubs algorithm forms 
groups, called clubs, with a maximum diameter 
of two hops. We will first describe this algorithm
assuming that the processors are reliable
and synchronous, and that messages are transmitted
instantaneously. These assumptions will
be removed in Section 4.

    integer R (upper bound for random numbers)
    boolean leader, follower = false

    procedure Clubs()
      t_i := R
      r_i := random[0, R)
      while (not follower and not leader)
        t_i := t_i - 1
        if (r_i > 0)
          r_i := r_i - 1
          if (not_empty(msg_queue))
            if (first(msg_queue) = "recruit")
              follower := true
        else
          leader := true
          broadcast("recruit")
      while (t_i > 0)
        listen for other leaders
        t_i := t_i - 1

Figure 1: Clubs Algorithm

Figure 2: This figure shows leaders forming clubs.
The dark processors are the current leaders and the
circles around them represent their local broadcast
region. All processors within this area are recruited
as members of the leader's club.

In the clubs algorithm, the processors compete 
to start new groups. The processors compete by 
choosing random numbers from a fixed integer 
range [0,R). Then each processor counts down 
from that number silently. If it reaches zero with- 
out being interrupted, the processor becomes a 
group leader and recruits its local neighborhood 
into its group by broadcasting a "recruit" mes- 
sage. The processors that get recruited are called 
followers. 

If a processor hears a recruit message from a 
neighbor before reaching zero, it becomes a fol- 
lower. Once it has been recruited as a follower, 
it can no longer compete to form a new group. 
Therefore it stops counting down. However it 
keeps listening for additional recruit messages. 
Groups are allowed to overlap, and a processor 
can be a follower of more than one leader. If a 
processor detects a collision (hears a garbled mes- 
sage) while counting down, it assumes that more 
than one of its neighbors tried to recruit it at the 
same time. It becomes a follower and figures out 
its leaders later. The algorithm completes when 
all processors are members of some group (leaders
or followers). Figure 1 presents the code run
on a single processor and Figures 2 and 3 show
clubs forming on a simulation of an amorphous
computer.
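The procedure described in this section can be sketched as a round-based simulation of the synchronous, reliable case with instantaneous messages. This is a minimal sketch under stated assumptions, not the authors' simulator: the adjacency-list input `adj`, the helper `random_counts`, and the choice to count a leadership conflict as one adjacent pair of leaders are all illustrative. A neighbor that hears one recruit message, or a collision of several, becomes a follower, as the text describes.

```python
import random

def random_counts(n, R, rng):
    """Each processor draws its countdown r_i uniformly from [0, R)."""
    return [rng.randrange(R) for _ in range(n)]

def run_clubs(adj, counts):
    """Simulate synchronous clubs; counts[i] is the countdown of processor i.

    Returns (leaders, followers, conflicts), where conflicts is the number
    of adjacent leader pairs.
    """
    n = len(adj)
    leader = [False] * n
    follower = [False] * n
    for step in range(max(counts) + 1):
        # Processors still competing whose countdown reaches zero now.
        firing = [i for i in range(n)
                  if counts[i] == step and not leader[i] and not follower[i]]
        for i in firing:
            leader[i] = True
        # Every non-leader neighbor of a new leader is recruited, whether it
        # hears a clean "recruit" message or a collision.
        for i in firing:
            for j in adj[i]:
                if not leader[j]:
                    follower[j] = True
    leaders = [i for i in range(n) if leader[i]]
    followers = [i for i in range(n) if follower[i]]
    conflicts = sum(1 for i in leaders for j in adj[i] if j > i and leader[j])
    return leaders, followers, conflicts
```

For example, on a three-node path with countdowns `[1, 1, 3]`, nodes 0 and 1 declare leadership in the same step, which produces one leadership conflict, and node 2 is recruited by node 1.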

4 Analysis 

4.1 Synchronous and Reliable Processors

Theorem 1: The clubs algorithm completes 
in R steps and produces valid groups, when the 
processors are synchronous. 

Proof 1: If [0, R) is the range from which
random numbers are chosen, then the algorithm
completes in time R. This is because each processor
chooses to be a follower or leader by the end
of its countdown and the countdown is chosen to 
be smaller than R. The club leader is adjacent 
to each processor in its club, which guarantees 
local routing as well as the maximum diameter 
of two hops. Clubs can be made non-overlapping 
by followers arbitrarily choosing one of the clubs 
they belong to. This does not violate the re- 
quired group properties because the group leader 
still guarantees that its members are locally con- 
nected in two hops. 

If we remove the assumption that messages are
instantaneous and each message takes time m to
transmit (all the messages are the same length in
the algorithm), then each processor should multiply
its random count, r_i, by m. If the processors
are synchronous, they will broadcast only at
intervals of m, thus preventing conflicts due to
partially overlapping messages. The total time
in that case is mR. By treating messages as
instantaneous, we are simply normalizing the unit
time to the transmission time, m.

Figure 3: This shows the final clubs formed. All processors
are either leaders or members of some club.
Processors with a darker shade of gray belong to more
than one club because they are in the overlapping region
of several leaders' broadcast range.

Leadership conflicts: The algorithm intro- 
duces a natural spacing between clubs. Leaders 
prevent their neighbors from competing to form 
new clubs. Therefore leaders will be non-adjacent 
and at least r distance apart. However if two 
neighboring nodes declare leadership at the same 
time, then two clubs are formed such that their 
leaders are adjacent. This is a leadership conflict. 
The expected number of leadership conflicts is in- 
timately related to the choice of the upper bound 
R. For many applications of clubs it is desirable 
that group leaders belong to only one club (their 
own) and that the overlap between clubs be lim- 
ited [5]. In addition the spacing between clubs 
allows us to derive important properties on the 
graph induced by the clubs (Section 5). There- 
fore, we would like to keep the number of leader- 
ship conflicts low. 

Theorem 2: The expected number of leadership
conflicts, E(conflicts), is at most (d_avg/2R)N,
for a synchronous amorphous computer.



Proof 2: We will first analyze the expected 
leadership conflicts in a simplified version of 
clubs, called sclubs. Then we will show that the 
expected number of leadership conflicts in sclubs 
is at least as large as that in the original clubs 
algorithm. 

In sclubs, leaders do not remove their neigh- 
bors from competition. Each processor chooses a 
random number, counts down and declares lead- 
ership when it reaches zero. Hence there are no 
followers. 

Each processor i chooses a value r_i from the
range [0, R). At step k of the algorithm, processor
i declares leadership if r_i = k. Processor i
experiences a leadership conflict at step k if it
chose r_i = k and some neighbor did as well.

We can determine the expected number of 
leadership conflicts at step k. 
A leadership conflict involves at least 2 processors,
therefore summing the probabilities of conflicts
over all processors overcounts the number
of conflicts at least twice. We can calculate the
probability that a node i experiences a leadership
conflict at step k.

P(r_i = k ∧ ∃ j ∈ n(i), r_j = k)
Since the random numbers are chosen in the
first step and the random numbers are less than 
R, there are R total steps. 



E(conflicts) < R 
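The bound of Theorem 2 can be recovered with a union bound over the neighbors of i, the halving argument for double counting, and the identity that the degrees sum to d_avg N. The following is one derivation consistent with the theorem statement; it is an assumption that the original argument proceeds in exactly these steps.

```latex
\begin{align*}
P\bigl(r_i = k \wedge \exists j \in n(i),\ r_j = k\bigr)
  &\le \frac{1}{R}\cdot\frac{d(i)}{R} = \frac{d(i)}{R^2}, \\
E(\text{conflicts at step } k)
  &\le \frac{1}{2}\sum_{i \in V}\frac{d(i)}{R^2} = \frac{d_{avg}\,N}{2R^2}, \\
E(\text{conflicts})
  &\le R\cdot\frac{d_{avg}\,N}{2R^2} = \frac{d_{avg}}{2R}\,N .
\end{align*}
```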
Lemma: The expected number of leadership
conflicts for sclubs is at least as large as the ex- 
pected number of leadership conflicts in the clubs 
algorithm. 



This is derived from the observation that for a
given set of random values chosen by the nodes,
the graph for sclubs, V_sclubs, is a superset of the
graph for clubs, V_clubs, at every step of the
algorithm. This is easily proven using induction
(proof omitted). As a result, the number of leadership
conflicts in V_sclubs is greater than or equal
to the number of leadership conflicts in V_clubs
at any step k. Since this is true for any initial
choice of random values, the overall E(conflicts)
in sclubs is ≥ E(conflicts) for the original clubs.

Corollary: If we choose R = a d_avg, where a
is a constant and a > 1, then the expected number
of conflicts is at worst a constant fraction of the
total number of nodes, (1/2a)N.

This follows immediately from Theorem 2. 
Thus, a can be chosen to make the percentage 
of leadership conflicts acceptably (or arbitrarily) 
small, at the expense of running time. 
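As a sanity check on the bound, note that in sclubs two adjacent processors conflict exactly when they draw the same number, which happens with probability 1/R. The expected number of conflicting adjacent pairs is therefore |E|/R = (d_avg/2R)N. The sketch below confirms this by exhaustive enumeration on tiny graphs. Counting conflicts as adjacent pairs is an assumption made here for illustration, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expected_conflicting_pairs(edges, n, R):
    """Exact expectation over all R**n assignments of r_i (sclubs model)."""
    total = 0
    for values in product(range(R), repeat=n):
        total += sum(1 for i, j in edges if values[i] == values[j])
    return Fraction(total, R ** n)

def theorem2_bound(n, d_avg, R):
    """The (d_avg / 2R) N bound from Theorem 2."""
    return Fraction(d_avg) * n / (2 * R)
```

With R = a d_avg the bound reduces to N/(2a), so doubling a halves the expected fraction of conflicts while doubling the running time.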

Apart from the leader conflicts, message collisions
also occur when two leaders with overlapping
broadcast ranges (i.e. within two hops of
each other) broadcast at the same time. The
choice of a also affects these collisions. Given
that a processor has at most 4p_max neighbors
within two hops and using arguments similar to
those for calculating leadership conflicts, the
expected number of collisions is < (4p_max/2R)N. Hence,
if R = a d_avg, then increasing a also decreases
collisions.



Number of Messages: The total number of 
messages is one per club. 

The only messages sent in this algorithm are 
leaders declaring the start of a new group, there- 
fore the total number of messages is equal to the 
number of groups formed. In Section 5 we show 
that there is an upper bound on the number of 
clubs formed for a given physical embedding that 
does not depend on N. This gives us an upper 
bound on the number of messages. The algo- 
rithm is efficient in the number of messages be- 
cause it takes advantage of the local broadcast 
mechanism, rather than using a point-to-point 
protocol. 



Effect of the Distribution of Processors:

The clubs algorithm works for arbitrary graphs
and is not significantly affected by the processor
distribution. The algorithm is most efficient
when d_avg is small compared to N. The distribution
does affect the resulting groups. The lower
the variance in distribution, the lower the variance
in the number of members in a group.

Global Knowledge: Since the processors
may be programmed before being embedded in
a surface, the algorithms should not depend on
global knowledge of the topology. The clubs
algorithm does not require knowing N or having
global IDs. Processors can estimate the value of
d_avg locally, if the variance in the density of
processors is small. Alternatively processors can use
the maximum neighborhood size, p_max, which is
a physical property of the processor known at
manufacture time.

No Global IDs: Although there are no global 
IDs, it is useful for leaders and groups to have 
names. A processor can randomly choose an id 
such that, with high probability, no processor 
within a two hop radius has the same id. If a 
processor chooses an id from [0,p^ aa ,), the prob- 
ability that another processor in the two hop ra- 
dius chooses the same value is less than (l/p^ax) 
(i.e. very small). The two hop uniqueness is im- 
portant from a follower's point of view, since it 
may belong to more than one group and therefore 
need to distinguish between two or more leaders. 
A leader can broadcast its id along with the 
recruit message in the clubs algorithm. In the 
case of a collision, a follower broadcasts a request 
after the clubs algorithm has completed to de- 
termine which clubs it belongs to. The follower 
can use the simple strategy of broadcasting a re- 
quest, waiting a random delay and then trying 
again. If the random delay is chosen from the
range [0, p_max) and all processors are using the
same strategy and range, it can be shown that
the expected number of trials after which a
processor i gets through is
(p_max/(p_max − 1))^(p_max − 1), which is
less than 3 [11]. We will refer to this broadcast
strategy as the random-wait protocol. This and
alternative point-to-point protocols (like exponential
backoff and CSMA) for broadcast networks
are presented in detail in [11]. Resolving collisions
using random-wait adds O(p_max) expected
steps to the algorithm and increases the number
of messages by O(conflicts).
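The "less than 3" figure can be illustrated with a simplified slot model. Assume that each of p_max contenders independently picks a delay slot uniformly from [0, p_max), and that a processor gets through when no other contender picks its slot. This model and the function names are assumptions for illustration and may differ from the analysis in [11].

```python
def success_probability(contenders, slots):
    """Chance that one processor's slot is chosen by no other contender."""
    return (1 - 1 / slots) ** (contenders - 1)

def expected_trials(contenders, slots):
    """Mean of a geometric distribution with the success probability above."""
    return 1 / success_probability(contenders, slots)
```

Under this model the expected number of trials with p_max contenders and p_max slots grows toward e (about 2.72) as p_max increases, so it stays below 3 for every neighborhood size.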

4.2 Asynchronous Processors 

The clubs algorithm is simple to implement in
an asynchronous environment because no synchronization
between processors is required during
the execution of the algorithm. The only
synchronization required is at the beginning of
the algorithm. Processors may start at differ- 
ent times, so if a processor announces leadership 
while its neighbors are not listening there will 
be unnecessary leadership conflicts. A processor 
needs to wait only for its immediate neighbors to 
complete previous tasks, before choosing a ran- 
dom number and counting down. This can be de- 
termined if all processors broadcast a done mes- 
sage after completing previous tasks. A proces- 
sor should continue to listen for recruit messages 
even after reaching the timeout. A processor can 
also locally determine when the algorithm is com- 
plete if its neighbors broadcast similar done mes- 
sages after reaching their timeouts. 

Theorem 3: The clubs algorithm completes
in D + R steps, where D is the delay between
when the first processor starts counting down and
the last processor starts counting down, for
asynchronous processors. The expected number of
leadership conflicts, E(conflicts), is still at most
(d_avg/2R)N, for the asynchronous amorphous
computer.

Proof 3: Let delay_i be the time between when
the first processor starts counting down and
processor i starts counting down¹. Each processor
can be treated as choosing a random number
r_i and a random offset delay_i, which is equivalent
to a processor starting at time 0 and choosing from
the range [0, D + R), where D is the maximum
delay. Hence each processor will have either
declared leadership or become a follower by D + R
steps. Therefore all processors will belong to a
group at the end of D + R steps.

Let processor i start counting down d timesteps
before some other processor j. Then the choices
of random numbers for which the two processors can
conflict must lie within the range [d, R) for processor i
and [0, R − d) for processor j. The probability of
having a conflict is (1/R²)(R − d). If we allow an
adversary to choose the delay, so as to maximize the
probability of leadership conflicts, we see that the
probability is maximized when the delay d = 0.
The probability of a leadership conflict decreases
when the processors are not synchronous. Hence
the expected number of leadership conflicts is still
at most (d_avg/2R)N.
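The delay argument can be checked by brute force. The sketch below enumerates every pair of countdown values for two neighbors, where processor i starts d steps before processor j, and counts the pairs in which both declare leadership at the same instant. The function name and the use of exact fractions are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def pair_conflict_probability(R, d):
    """Probability that neighbors i and j fire at the same time step.

    Processor i starts d steps before j; both draw uniformly from [0, R).
    """
    hits = 0
    for r_i in range(R):
        for r_j in range(R):
            # i fires at absolute time r_i - d, j fires at absolute time r_j.
            if r_i - d == r_j:
                hits += 1
    return Fraction(hits, R * R)
```

For R = 10 the probability is 1/10 with no delay and 7/100 with a delay of 3 steps, which agrees with the expression (R − d)/R² and shows that an adversary cannot do better than d = 0.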

A similar argument can be made when the processor
speeds are different. Let processor i operate
at a speed S times that of processor j, where
S > 1, and let both processors start at time 0.
Then collisions will occur only when processor
j chooses values from the range [0, ⌊R/S⌋) and
processor i chooses values from [0, R) that are
divisible by S. The probability of collision is
(1/R²)⌊R/S⌋. Again this is maximized when S = 1,
i.e. the speeds are the same. The reason is that
the range of random choices over which a conflict
can occur has decreased. The time taken by the
algorithm to complete is the time taken by the
slowest processor to count to zero.

¹ We assume that delay includes the time required to
retransmit done messages.

Treating messages as having non-zero trans- 
mission times also does not affect the expected 
number of conflicts. Since a processor knows not 
to send a message while it is currently receiv- 
ing one, conflicts will only occur when two ad- 
jacent processors choose the same time to start 
declaring leadership. If one processor precedes 
the other, then the second processor will sense 
that the channel is busy before sending a mes- 
sage and abort its claim for leadership. Thus, 
leadership conflicts will not occur due to partially 
overlapping recruit messages. 



4.3 Processor Failures 

Only failures of the leaders affect the clubs algo- 
rithm. Until now we have assumed that proces- 
sors are reliable. A processor may stop execut- 
ing at any time (stopping failures). The clubs 
algorithm is robust to most failures because pro- 
cessors execute relatively independently and the 
communication is simple. Processors failing be- 
fore finishing the countdown or after becoming 
followers do not affect the algorithm. However, 
leaders guarantee that a group has local routing 
and a maximum diameter of two hops. If a leader 
fails, the group may potentially become discon- 
nected and the diameter may increase, violating 
two group properties. 

Adaptive Clubs: The clubs algorithm can
be extended so that whenever a leader fails, its
followers rerun the clubs algorithm to elect a
new leader(s). After completing the initial group
formation, each leader periodically reasserts its
leadership. If a follower does not hear a leader
for several time intervals (T_1), it broadcasts a
challenge to the leader. If it does not hear a
response from the leader within a certain timeout
period (T_2), then it assumes the leader is dead
and broadcasts a message declaring the leader
dead. Upon hearing this message, only the
processors that do not belong to any group need to
rerun the clubs algorithm. Hence the overlapping
groups add robustness by decreasing the effect of
the failure of a single leader.
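The follower side of this failure detector can be sketched as a small decision function. This is a hypothetical illustration of the two timeouts only: the function name, the argument names, and the string return values are assumptions, and the actual broadcast of challenges (for example with random-wait) is left to the caller.

```python
def follower_step(now, last_heard, challenge_sent, t1, t2):
    """Decide what a follower should do about one of its leaders.

    now            -- current local time
    last_heard     -- time of the last message heard from the leader
    challenge_sent -- time the pending challenge was sent, or None
    t1, t2         -- silence timeout and challenge-response timeout
    Returns "wait", "challenge", or "declare_dead".
    """
    if challenge_sent is None:
        # No challenge pending: challenge only after t1 of silence.
        return "challenge" if now - last_heard >= t1 else "wait"
    if last_heard > challenge_sent:
        # The leader was heard after the challenge; it is alive.
        return "wait"
    # Challenge pending and unanswered: give up after t2.
    return "declare_dead" if now - challenge_sent >= t2 else "wait"
```

A caller would reset `challenge_sent` to None once the leader has been heard again, and would rerun the clubs algorithm only if the processor is left with no group after a leader is declared dead.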

Correctness: If a leader dies, members will 
eventually detect it. They will either hear a dec- 
laration that the leader is dead, or timeout them- 
selves and challenge the leader. If they do not 
belong to any group, then they will compete us- 
ing the clubs algorithm, until all of them belong 
to some group. Even if a leader is not dead, a
processor may falsely think it is dead (it challenged
the leader and for some reason did not
hear the response within the timeout T_2). As
a result, several processors may rerun the clubs 
algorithm and create new groups and later re- 
alize that the leader is alive. This results in 
unnecessary clubs (and leadership conflicts) but 
still guarantees that all processors belong to some 
club. 

Parameters: The rate at which the leader re- 
asserts leadership depends on the expected time 
to failure as well as how soon the application 
needs to detect a failure. The leader does not 
have to send a special message, as long as it 
broadcasts some message within the specified period.
To challenge a leader or to respond to a
challenge, a processor can use the random-wait
protocol or exponential backoff. The timeouts
T_1 and T_2 depend on the expected time for messages
to get through, given the particular broadcast
strategy. The timeouts should be chosen so
that probability of false death declarations is low 
and the incidence of unnecessary leadership con- 
flicts is low. 

Thus, adaptive clubs can reorganize to accom- 
modate failures. This method can also be used 
for accommodating new processors into an al- 
ready existing clubs structure. 

4.4 Conclusions 

The clubs algorithm produces groups of diam- 
eter two hops in time proportional to the lo- 
cal neighborhood size. The algorithm does not 
require point-to-point communication or syn- 
chronous processors. Rather it takes advantage 
of those properties that are generally difficult to 
deal with. This is achieved by relying on the 
local broadcast mechanism rather than point-to- 
point message exchanges and allowing leadership 
conflicts to occur probabilistically. Hence com- 
plex synchronization is not required and the al- 
gorithm performs efficiently in both synchronous 
and asynchronous settings. 

In addition the algorithm satisfies several other
constraints that generally occur in large distributed
systems. The algorithm does not require
global IDs, relying on randomization instead,
and does not require that processors
know the diameter of the network or the number
of nodes. The algorithm can be extended to
reorganize groups automatically in response to
processor failures or the addition of new processors.



5 Physical Properties of Clubs

A distinctive property of an amorphous computer 
is that it has geometry as well as topology. The 
geometry is derived from the communication ge- 
ometry and the space within which the amor- 
phous computer is embedded. The geometry can 
be used to derive additional properties of the 
clubs algorithm. 

Assuming that there are no leadership conflicts,
we can derive two bounds on the clubs algorithm:

Theorem 4: The maximum number of clubs
formed is fixed for a given surface area and
communication radius, and does not depend on N.

Theorem 5: The degree of a club is at worst 24.

The proof for both theorems is based on model- 
ing a processor as a circle of radius r/2, centered 
at the processor (figure 4). An edge implies a 
maximum physical distance of r. Therefore the 
circles will overlap if and only if the processors 
are adjacent. 

Proof 4: If there are no leadership conflicts, 
then all the leaders are non-adjacent. The max- 
imum number of clubs that can be formed in a 
given area is the same as the maximum num- 
ber of leaders one can place in that area without 
violating the constraint that no two leaders be 
adjacent. 

If we model each leader as a circle of radius
r/2, as described before, the problem of finding
the maximum number of clubs can be restated as
a packing problem, i.e. what is the densest packing
of non-intersecting circles of radius r/2 in a
plane. In a two dimensional plane, the densest
packing of circles is a hexagonal packing. Hence
the densest packing of leaders is a hexagonal lattice
where the distance between two adjacent lattice
points is r. This implies that, for a given
surface and a given communication radius, the
maximum number of clubs is fixed irrespective of
N and is equal to the number of grid points on
the hexagonal lattice.

Figure 4: A node is modeled as a circle of radius r/2.
(a) If the circles intersect, the nodes are adjacent.
(b) If the circles do not intersect, the nodes are not
adjacent.
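To get a feel for the size of this bound, the sketch below counts the points of a hexagonal lattice with spacing r that fall inside a rectangular surface. It is only an estimate under assumptions: the lattice is anchored at one corner, the surface is a rectangle, and the function name is hypothetical. Other placements of the lattice can change the count slightly near the boundary.

```python
import math

def hex_lattice_count(width, height, r):
    """Count hexagonal-lattice points with spacing r in a width x height rectangle."""
    row_gap = r * math.sqrt(3) / 2       # vertical distance between rows
    rows = int(height // row_gap) + 1
    count = 0
    for k in range(rows):
        offset = r / 2 if k % 2 else 0.0  # odd rows are shifted by r/2
        if offset <= width:
            count += int((width - offset) // r) + 1
    return count
```

The count depends only on the surface dimensions and r, not on the number of processors, which is the content of Theorem 4. For a 10 x 10 surface with r = 1 this placement gives 126 lattice points.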

Proof 5: If we consider each club to be a node 
and clubs that overlap to be adjacent, we can talk 
about the graph induced by the clubs. The de- 
gree of a given club is the maximum number of 
clubs that it can be adjacent to. In order for 
two clubs to be adjacent, or overlap, their lead- 
ers must be less than 2r apart (by the triangle in- 
equality). Hence for a particular leader, all lead- 
ers within the circle of radius 2r are potential 
neighbor clubs. However leaders must be at least 
r distance apart. If we model the neighboring 
leaders as non-overlapping circles of radius r/2, 
then all the circles must fit within an annulus 
of inner radius r/2 and outer radius (2r + r/2), 
centered at the given leader. Since each circle
occupies an area of $\pi r^2/4$, no more than 24 non-adjacent
leaders can be placed in the annulus.
Therefore a leader can have no more than 24 
neighboring leaders and the degree is at worst 24. 
Hence we see that the degree is upper bounded
by a constant and does not depend on the area
or $N$. Using a similar argument, a processor can
belong to no more than 9 clubs.
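The two constants can be checked with a short area calculation. This is a sketch of the counting step only: it bounds the number of circles by a ratio of areas rather than by an exact packing, and the second line assumes that the "similar argument" is the analogous ratio for a disc of radius 3r/2 around a processor, since every leader of a club containing that processor lies within distance r of it.

```latex
\[
\frac{\pi\,(2r + r/2)^2 - \pi\,(r/2)^2}{\pi\,(r/2)^2}
  = \frac{(25/4)\,r^2 - (1/4)\,r^2}{(1/4)\,r^2} = 24
\]
\[
\frac{\pi\,(r + r/2)^2}{\pi\,(r/2)^2}
  = \frac{(9/4)\,r^2}{(1/4)\,r^2} = 9
\]
```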

If the number of leadership conflicts is small, 
these theorems will still hold with high proba- 
bility. In Section 7.1 we provide an extension to 
the clubs algorithm that guarantees no leadership 
conflicts. 

Both bounds are very useful for designing algo- 
rithms on top of clubs. The first bound provides 
an estimate of the number of groups to expect, 
given the area and communication radius. The 
second bound tells us that the graph induced by 
the clubs has small degree. The decomposition of 
processors into groups with small diameter such 
that the graph induced by the groups has small 
degree is very useful in the design of many algorithms
[3, 5].

Figure 5: Leadership conflicts as a percentage of $N$
vs. $a$ for different $(N, d_{avg})$ pairs



6 Simulation Results 

We simulated an amorphous computer run- 
ning the clubs algorithm with 1000, 4000 and 
8000 processors, each with average neighborhood 
sizes of 10, 30 and 50, for different values of a. 
The processors are uniformly distributed over a 
unit square surface. The processors are asyn- 
chronous with a small delay (less than a message 
transmission time) and the message transmission
time is a hundred clock cycles. A collision occurs
if messages partially overlap. The processors
choose a random value from the range $[0, R)$,
where $R = a\,d_{avg}$, and multiply it by the
transmission time.
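A setup of this kind can be sketched as follows. This is an illustration under stated assumptions, not the simulator used for the results: the communication radius is derived from the target neighborhood size by assuming d_avg is approximately N times pi r^2 (which ignores the edges of the square), R is rounded to an integer, and the names `make_layout` and `draw_wait` are hypothetical.

```python
import math
import random


def make_layout(n, d_avg, a, rng):
    """Place n processors uniformly in the unit square and connect
    every pair that lies within the communication radius r."""
    # Assumption: d_avg ~= n * pi * r^2, ignoring boundary effects.
    r = math.sqrt(d_avg / (math.pi * n))
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {i: [] for i in range(n)}
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= r * r:
                adj[i].append(j)
                adj[j].append(i)
    # Upper bound for the random countdown values, R = a * d_avg.
    big_r = int(round(a * d_avg))
    return points, adj, r, big_r


def draw_wait(big_r, transmission_time, rng):
    """Random value from [0, R) multiplied by the transmission time."""
    return rng.randrange(big_r) * transmission_time
```

Because processors near the edge of the square have fewer neighbors, the measured average neighborhood size in such a layout tends to fall slightly below the target d_avg.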

The simulation results correspond well with
the analysis for the synchronous case. Figure 5
plots the number of leadership conflicts, as a
percentage of the total number of processors, for four
$(N, d_{avg})$ pairs (see footnote 2). Each data point is
averaged over several runs. In each run the layout of the
processors is changed and new random values are
chosen. Therefore there is significant variation in
the number of conflicts in each run. As we can
see, the average percentage of conflicts varies as
expected with $a$ and does not seem to be affected
by $N$ or $d_{avg}$.

In Figure 6, we have plotted the number of clubs
formed in the same experiments against the
communication radius. As we see, even with up to
10% conflicts, the number of clubs varies with the
radius close to expectations. Each data point is
averaged over twenty runs with different processor
layouts; however, there is little variation in the
number of clubs formed in each run. This is be- 



2 The remaining curves also occupy the same region, so
for clarity we have plotted only 4 of the 9 curves.

clubs. The algorithm completes when there are
no leadership conflicts left. We call this algorithm
clubs-MIS.

The clubs algorithm guarantees that all non- 
leader nodes are adjacent to some leader. The 
clubs-MIS algorithm runs a new round of clubs if 
any of the leaders are adjacent. The followers of 
the conflicting leaders are forced to compete as 
well if they are no longer adjacent to some leader. 
The algorithm keeps running until there are no 
conflicting leaders. Hence, the final set of leaders 
forms an MIS. 
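The round structure of clubs-MIS can be illustrated with a small synchronous simulation. This is a simplified sketch rather than the implementation evaluated in this paper: it assumes an idealized model in which a recruit message is always heard by neighbors that have not yet counted down, and in which a leadership conflict is exactly the case of two adjacent processors reaching zero at the same time step. The function name `clubs_mis` and the adjacency-dictionary representation are illustrative choices.

```python
import random


def clubs_mis(adj, big_r, rng):
    """Run rounds of a simplified clubs election until no processor
    needs to compete. Returns (leaders, number_of_rounds)."""
    if big_r < 2:
        raise ValueError("R must be at least 2 for conflicts to resolve")
    leaders = set()
    competing = set(adj)
    rounds = 0
    while competing:
        rounds += 1
        timer = {v: rng.randrange(big_r) for v in competing}
        declared = {}  # processor -> time step at which it declared leadership
        for v in sorted(competing, key=timer.get):
            # v is recruited if a neighbor declared strictly earlier.
            if any(u in declared and declared[u] < timer[v] for u in adj[v]):
                continue
            declared[v] = timer[v]
        # Conflict detection: adjacent new leaders declared at the same time.
        conflicted = {v for v in declared
                      if any(u in declared for u in adj[v])}
        leaders |= set(declared) - conflicted
        # Abdicated leaders, and followers left without a leader, compete again.
        competing = {v for v in adj
                     if v not in leaders
                     and not any(u in leaders for u in adj[v])}
    return leaders, rounds
```

When the loop ends, no two leaders are adjacent and every other processor has a leader as a neighbor, which is the MIS condition described above.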



Figure 6: Number of clubs formed vs. communica- 
tion radius for processors distributed in a unit square 



cause the number of clubs depends more on the 
surface and communication geometry than pro- 
cessor layout or initial random state. In all of 
the runs, no processor belonged to more than 5 
clubs, in spite of the conflicts. 



Theorem 6: The expected time to find the
MIS is $O(p_{max} \log N)$ in a synchronous amorphous
computer.

Proof 6: In each round, the competing nodes
choose new random numbers from the range
$[0, R)$. Therefore, each round is independent.
The upper bound $R$ is chosen to be $a\,d_{avg}$ and
is the same for every round. Let the number of
nodes in round $k$ be $N_k$. In round $k + 1$ only the
nodes that experienced a conflict will re-compete.



7 Extensions 

In this section we describe the extension of the
clubs algorithm to solve problems like MIS and
$\Delta + 1$ vertex coloring.

7.1 Maximal Independent Set 

A maximal independent set (MIS) is a set of 
nodes in a graph such that no two nodes in the set 
are adjacent (independent) and no more nodes 
can be added to that set without violating in- 
dependence (maximal). Computing the MIS of 
the graph induced by a network is a useful tool 
for solving many distributed computing problems 
[10,9]. 

In an amorphous computer, the set of club 
leaders is almost an MIS on the amorphous com- 
puter graph. All non-leader processors (follow- 
ers) are adjacent to at least one leader, and hence 
cannot be added to the MIS. Most leaders are
non-adjacent, except those that have a leadership
conflict. Solving for an MIS is equivalent to
guaranteeing that no leadership conflicts occur.

This can be achieved by running several rounds 
of the clubs algorithm. After each round, there 
is a conflict-detection stage during which lead- 
ers determine if they experienced a leadership 
conflict. If so they abdicate leadership and they 
and their followers compete in the next round of 



$$E(N_{k+1}) = E(\text{conflicts in round } k) \le \frac{1}{2a} N_k$$

Therefore, $E(N_{k+1}) < 1$ after $k = \log_{2a} N$ rounds.



Hence, all processors are expected to have been
removed from the graph in $O(\log N)$ rounds.
Each round of clubs takes $R$ steps, where
$R = a\,d_{avg}$. In the conflict-detection stage, only the
leaders need to exchange messages to determine
if there were any conflicts. Using the random-wait
protocol for communication, the conflict-detection
stage takes $O(p_{max})$ expected steps.
Therefore the expected time to find the MIS using
clubs-MIS is $O(p_{max} \log N)$.
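The dependence of the round count on a can be made concrete with a few lines of code. The sketch below assumes the recurrence is applied to expectations, so that the expected number of competing nodes after k rounds is at most N/(2a)^k, and it returns the smallest k for which that bound drops below 1; the function name `rounds_bound` is illustrative.

```python
def rounds_bound(n, a):
    """Smallest k such that n / (2a)^k < 1, i.e. the number of rounds
    after which the expected number of competing nodes is below one."""
    if 2 * a <= 1:
        raise ValueError("the bound only shrinks when 2a > 1")
    k = 0
    bound = float(n)
    while bound >= 1.0:
        bound /= 2 * a
        k += 1
    return k
```

Increasing a reduces the number of rounds but lengthens each round, since R grows with a; this is the trade-off to weigh when synchronization and conflict detection dominate the cost.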

The clubs algorithm uses a small number of 
messages per round and naturally staggers mes- 
sages to avoid collisions. Furthermore one can 
choose a to reduce the number of rounds. This 
makes clubs particularly suited to the asyn- 
chronous amorphous environment. In the asyn- 
chronous implementation, there needs to be syn- 
chronization at the beginning of each new round 
to make sure all conflicts have been detected before
running the next round. Since both synchronization
and conflict-detection are expensive compared
to the clubs algorithm, $a$ can be chosen
to minimize the overall time.

Luby [9] presents an algorithm for finding an
MIS, which also takes $O(p_{max} \log N)$ time in an
amorphous computer. Processors choose a random
value and compare it with their neighbors'
values. The processors with the minimum values 
become leaders and remove their neighborhood 
from the graph. The remaining processors take 
part in a new round. The algorithm continues 
until there are no processors left. In this algorithm,
each round requires a complete exchange
of messages between all neighbors, which takes
$O(p_{max})$ steps. This is difficult to implement
efficiently, since processors do not synchronize message
sending. Using a protocol like random-wait,
a significant amount of time is wasted due to mes- 
sage collisions. There is also no control over the 
number of rounds. The clubs algorithm is sim- 
pler to implement in an amorphous computer and 
takes advantage of the local broadcast capability. 
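For comparison, the round structure just described can be sketched as follows. This follows only the summary given in this section (local minima become leaders and remove their neighborhoods) and is not a reproduction of the algorithm in [9]; message exchange is idealized as an instantaneous comparison of values, and the name `local_minimum_mis` is illustrative.

```python
import random


def local_minimum_mis(adj, rng):
    """Each round, every remaining processor draws a random value; strict
    local minima join the MIS and remove themselves and their neighbors."""
    remaining = set(adj)
    mis = set()
    rounds = 0
    while remaining:
        rounds += 1
        value = {v: rng.random() for v in remaining}
        # Comparing with all neighbors is the step that needs a full
        # exchange of messages in a real amorphous computer.
        winners = {v for v in remaining
                   if all(value[v] < value[u]
                          for u in adj[v] if u in remaining)}
        mis |= winners
        removed = set(winners)
        for v in winners:
            removed.update(adj[v])
        remaining -= removed
    return mis, rounds
```

Every processor must learn all of its neighbors' values before it can decide, whereas in clubs a processor only needs to hear the first recruit message.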

7.2 $\Delta + 1$ Coloring

Vertex coloring assigns a color to each node of a
graph such that no two adjacent nodes have the
same color. $\Delta + 1$ vertex coloring implies that
the graph is colored with $\Delta + 1$ colors, where $\Delta$ is
the maximum degree of the graph ($\Delta < p_{max}$).
MIS algorithms can be extended to do graph coloring
[7]. In this section we will extend the clubs
algorithm to do $\Delta + 1$ graph coloring.

The clubs-coloring algorithm proceeds by running
multiple rounds of color-picking. After each
round of color-picking, color conflicts are de- 
tected (i.e. cases where two adjacent nodes have 
the same color) and only the conflicting nodes 
participate in the next round of color-picking. 
The algorithm completes when there are no con- 
flicts, i.e. all nodes have been assigned a valid 
color. 

Color-picking uses a countdown mechanism
similar to that of clubs. Figure 7 presents the code for
executing a round of color-picking on a single
processor. A processor chooses a random number
from the range $[0, R)$. As it counts down silently,
it collects colors that it hears from its neighbors.
When it reaches zero it chooses the smallest color
not chosen by its neighbors and broadcasts that
color. Once the round is complete, the processor
checks for leadership conflicts. If two leaders
broadcast at the same time (a leadership conflict),
they might have chosen the same color. Or
some node may not have heard the color they


integer R (upper bound for random numbers)
list color_list = empty (list of neighbor colors)

procedure Club_color_picking ()
    t_i := R
    r_i := random [0,R)
    while (t_i > 0)
        if (not-empty(msg-queue))
            newcolor := first(msg-queue)
            insert(color_list, newcolor)
        if (r_i = 0)
            broadcast (smallest color not ∈ color_list)
        r_i := r_i - 1
        t_i := t_i - 1


Figure 7: Algorithm for Color Picking 

chose due to the collision, and may have chosen
the same color. In either case the processors
that broadcast at the same time must renounce
their colors and participate in the next round of
color-picking. Since processors always choose the
smallest color not chosen by their neighbors, and
the maximum number of neighbors is $\Delta$, the color
values range from 1 to $\Delta + 1$.
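The repeated rounds of color-picking can be illustrated with a small synchronous simulation. This is a simplified sketch under stated assumptions: a conflict is modeled only as two adjacent processors broadcasting at the same time step, collisions at third-party listeners are ignored, and a processor bases its choice on the colors its neighbors currently hold rather than on an accumulated list. The function name `clubs_coloring` is illustrative.

```python
import random


def clubs_coloring(adj, big_r, rng):
    """Repeat rounds of color-picking until every processor holds a
    color that differs from all of its neighbors' colors."""
    if big_r < 2:
        raise ValueError("R must be at least 2 for conflicts to resolve")
    color = {}
    uncolored = set(adj)
    rounds = 0
    while uncolored:
        rounds += 1
        timer = {v: rng.randrange(big_r) for v in uncolored}
        heard = dict(color)  # colors broadcast so far, by processor
        for t in sorted(set(timer.values())):
            # Processors firing at the same step choose before hearing
            # each other, so their choices are collected and then published.
            choices = {}
            for v in uncolored:
                if timer[v] != t:
                    continue
                taken = {heard[u] for u in adj[v] if u in heard}
                c = 1
                while c in taken:
                    c += 1
                choices[v] = c
            heard.update(choices)
        # Adjacent processors that broadcast at the same time renounce.
        conflicted = {v for v in uncolored
                      if any(u in timer and timer[u] == timer[v]
                             for u in adj[v])}
        for v in uncolored - conflicted:
            color[v] = heard[v]
        uncolored = conflicted
    return color, rounds
```

Since a processor avoids at most as many colors as it has neighbors, the smallest free color never exceeds its degree plus one, which is where the bound of $\Delta + 1$ colors comes from.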

Theorem 7: The expected time for $\Delta + 1$ coloring
is $O(p_{max} \log N)$ in a synchronous amorphous
computer.

Proof 7: Color-picking is similar to the simplified
clubs, sclubs, presented in Section 4, because
processors that count down to zero do
not prevent their neighbors from continuing to
count. Hence, if $R$ is chosen to be $a\,d_{avg}$, the
expected number of conflicts is less than or equal
to $(1/2a)N$. By the same argument as for clubs-MIS,
the number of nodes in each round is less
than a constant fraction of the number in the previous
round; therefore all nodes will be removed in $O(\log N)$
expected rounds. Each round takes $O(p_{max})$
expected time, therefore the total expected time is
$O(p_{max} \log N)$.



8 Example Applications of the Clubs Algorithm



Clubs can be used for task specialization,
increased robustness, or resource allocation. In this
section we provide three examples of using the
clubs to address the issue of efficient communication
in an amorphous computer. Several other
examples are presented in [5].

The clubs can be used as a higher-level point-to-point
network. The leaders communicate
point-to-point with each other and relay messages
to and from their members. The communication
between adjacent leaders is accomplished
via elected representatives in the overlap
regions. This significantly reduces the number
of messages and potential collisions. For example,
a full broadcast operation can be implemented
by constructing a spanning tree on the graph
induced by the leaders. Collisions can be further
reduced if leaders run a coloring algorithm
to choose non-interfering channels and members
choose a single club to belong to. Within the
group, a leader can poll its members to prevent
collisions between members. This is analogous
to self-organizing a cellular network with clubs
as cells and leaders as base stations. The upper
bound of 24 on the degree of a club tells us the
maximum number of distinct channels required.
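As an illustration of the broadcast example, a spanning tree over the graph induced by the leaders can be built by breadth-first search. The text does not say how the tree is constructed, so breadth-first search is an assumption here, as are the name `broadcast_tree` and the adjacency-dictionary input.

```python
from collections import deque


def broadcast_tree(leader_adj, root):
    """Breadth-first spanning tree over the leader graph. Returns a
    map from each reachable leader to its parent (None for the root)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in leader_adj[u]:
            if v not in parent:
                parent[v] = u  # v first hears the broadcast from u
                queue.append(v)
    return parent
```

A broadcast then forwards each message once along every tree edge, so the number of leader-to-leader transmissions is one less than the number of reachable leaders.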

An extension of this idea is to use the clubs 
algorithm to self-organize a hierarchical network 
for efficient non-local communication. The group 
leaders can run the clubs algorithm to form 
higher level groups. In a separate paper [5] we 
show how the clubs local leader election mecha- 
nism can be extended to form groups of a given 
diameter h. The efficiency of the resulting net- 
work will depend on the number of levels in the 
hierarchy and the diameter of groups at each 
level. 

It is also possible to use the clubs-based coloring
algorithm to create efficient local point-to-point
communication for applications that require
frequent local exchanges of values, such
as partial differential equation (PDE) calculations
and cellular automata style local rules.
For such algorithms, protocols like random-wait
are inefficient due to the high percentage of collisions.
The $\Delta + 1$ coloring can be used to implement
CDMA (code division multiple access) in
an asynchronous amorphous computer [11, 6]. In
CDMA, messages modulated with different digital
codes can be broadcast simultaneously without
interfering. Processors use their color to determine
which code to listen on. The sender
broadcasts using the code of the intended receiver.
The number of codes required is $\Delta + 1$,
which is upper bounded by $p_{max}$. By assigning
nearby processors different channels to listen on,
the probability of collisions can be significantly
reduced. However, a receiver can still only receive
from a single sender at any time, hence it
is not the same model as a wired point-to-point
network. Nevertheless it can be used to apply
point-to-point algorithms more efficiently on the
amorphous computer. Our hardware prototype
will support spread spectrum CDMA and use this
mechanism to assign channels.

9 Conclusion 

In this paper we presented the clubs algorithm 
for forming groups in an amorphous computer. 
The clubs algorithm performs efficiently by tak- 
ing advantage of the local broadcast mechanism 
and directly addressing the problem of message 
loss through collisions. The simplicity of the lo- 
cal leader election mechanism makes it easy to 
extend to asynchronous processors without com- 
plex synchronization. In addition the algorithm 
does not use global IDs and can be extended to 
deal with processor failures. The algorithm can 
also be used in point-to-point distributed envi- 
ronments with similar constraints. 

In addition, we derive upper bounds on the 
number of groups formed and the density of 
groups formed by the clubs algorithm, using the 
physical embedding of the amorphous computer. 
We present simulation results for the clubs algo- 
rithm that concur with the analysis. Extensions 
of the clubs algorithm to solve for a maximal independent
set and produce a $\Delta + 1$ coloring are
presented. Lastly, we present three examples of
applying the clubs algorithm to address communication
issues in an amorphous computer.